Reducing Cache Misses
When the frame rate is high, your concern shifts from the loss of milliseconds to a page fault to the loss of microseconds to a cache miss. When your program accesses instructions or data that are not in cache memory (see Figure 2-1 on "Multiprocessor Architecture" and "Memory Hierarchy"), the CPU requests a load of a "cache line" of 128 bytes from main memory. Possibly hundreds of CPU clock cycles pass while the cache line is being loaded. Because the CPU is pipelined, it can often continue to work during this delay; however, multiple successive cache misses can bring effective work to a halt for tens of microseconds.
In a normal program, delays due to cache misses are not noticeable because the overall average speed of the program is satisfactory. However, for a real-time program with a frame rate above 50 Hz, a cache miss can cause the unpredictable loss of a useful fraction of one frame interval.
Note: In addition to the following guidelines, the IRIX kernel assists you in maintaining good cache use with special scheduling rules. See "Understanding Affinity Scheduling".
- Locality of Reference
- Cache Mapping in Challenge/Onyx
- Multiprocessor Cache Conflicts
- Detecting Cache Problems